
Transformation matrices

In the previous article, we saw how to compose multiple transformations by applying them in sequence:

```wgsl
const star = array<vec2f, 30>(
    // Outer star points
    vec2f(0.000000, -0.197140), vec2f(0.309011, -0.425297), vec2f(0.187467, -0.060929),
    vec2f(0.187467, -0.060929), vec2f(0.500000, 0.162443), vec2f(0.115866, 0.159500),
    vec2f(0.115866, 0.159500), vec2f(0.000000, 0.525707), vec2f(-0.115866, 0.159500),
    vec2f(-0.115866, 0.159500), vec2f(-0.500000, 0.162443), vec2f(-0.187467, -0.060929),
    vec2f(-0.187467, -0.060929), vec2f(-0.309011, -0.425297), vec2f(0.000000, -0.197140),
    // Inner star points
    vec2f(0.000000, -0.197140), vec2f(0.000000, 0.000000), vec2f(0.187467, -0.060929),
    vec2f(0.187467, -0.060929), vec2f(0.000000, 0.000000), vec2f(0.115866, 0.159500),
    vec2f(0.115866, 0.159500), vec2f(0.000000, 0.000000), vec2f(-0.115866, 0.159500),
    vec2f(-0.115866, 0.159500), vec2f(0.000000, 0.000000), vec2f(-0.187467, -0.060929),
    vec2f(-0.187467, -0.060929), vec2f(0.000000, 0.000000), vec2f(0.000000, -0.197140)
);

@vertex
fn vs(@builtin(vertex_index) index: u32) -> @builtin(position) vec4f {
    // Step 1: Scale
    let sX = 0.5;
    let sY = 0.5;
    var transformed = vec2f(
        star[index].x * sX,
        star[index].y * sY
    );

    // Step 2: Rotation
    let theta = radians(45.0);
    let cosT = cos(theta);
    let sinT = sin(theta);
    transformed = vec2f(
        transformed.x * cosT - transformed.y * sinT,
        transformed.x * sinT + transformed.y * cosT
    );

    // Step 3: Translation
    let tX = 0.2;
    let tY = -0.1;
    transformed += vec2f(tX, tY);

    return vec4f(transformed, 0.0, 1.0);
}
```

The implementation, however, is not very flexible because it is limited to the composition of scale, rotation, and translation. If we wanted to apply different transformations, or change their order, we would be forced to modify the source code each time.

In this article, we will define geometric transformations using matrices, a tool from linear algebra. Because matrices represent both transformations and their compositions uniformly, they make the code generic and reusable for any sequence of transformations.

The final code is available in the `04-matrix` branch of the GitHub repository.

## Matrices

A rectangular matrix $M = (m_{ij})$ of dimensions $m \times n$ is a "table" of real numbers:

$$M = \begin{bmatrix} m_{11} & m_{12} & \cdots & m_{1n} \\ m_{21} & m_{22} & \cdots & m_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ m_{m1} & m_{m2} & \cdots & m_{mn} \end{bmatrix}$$

where $m$ is the number of rows, $n$ the number of columns, and $m_{ij}$ are the elements of the matrix.

A matrix of dimensions $1 \times n$ is called a row matrix:

$$\begin{bmatrix} m_{11} & m_{12} & \cdots & m_{1n} \end{bmatrix}$$

while a matrix of dimensions $m \times 1$ is called a column matrix:

$$\begin{bmatrix} m_{11} \\ m_{21} \\ \vdots \\ m_{m1} \end{bmatrix}$$

Finally, a matrix of dimensions $n \times n$ is called a square matrix of order $n$:

$$\begin{bmatrix} m_{11} & m_{12} & m_{13} & \cdots & m_{1n} \\ m_{21} & m_{22} & m_{23} & \cdots & m_{2n} \\ m_{31} & m_{32} & m_{33} & \cdots & m_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ m_{n1} & m_{n2} & m_{n3} & \cdots & m_{nn} \end{bmatrix}$$

Below are shown, in order, a $2 \times 3$ rectangular matrix, a square matrix of order $2$, and a row matrix:

$$\begin{bmatrix} \pi & -e & 1 \\ 0 & 5 & -\frac{1}{2} \end{bmatrix} , \quad \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} , \quad \begin{bmatrix} -1 & 0 & 5 \end{bmatrix}$$

## Matrix product

Given a matrix $A = (a_{ij})$ of dimensions $m \times n$ and a matrix $B = (b_{ij})$ of dimensions $n \times p$, the product of $A$ and $B$ is the matrix $C = (c_{ij})$ of dimensions $m \times p$ defined as:

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

In other words, the element $c_{ij}$ is obtained by combining the $i$-th row of matrix $A$ with the $j$-th column of matrix $B$. For example:

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix}$$

*Product of two 2×2 matrices.*

where the individual elements are:

$$\begin{align*} c_{11} &= a_{11} b_{11} + a_{12} b_{21} \\ c_{12} &= a_{11} b_{12} + a_{12} b_{22} \\ c_{21} &= a_{21} b_{11} + a_{22} b_{21} \\ c_{22} &= a_{21} b_{12} + a_{22} b_{22} \end{align*}$$

The product is defined only between "compatible" matrices, that is, when the number of columns of the first matrix equals the number of rows of the second. The result has the number of rows of the first matrix and the number of columns of the second. A simple way to remember this rule:

$$[m \times n] \cdot [n \times p] = [m \times p]$$
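To make the row-by-column rule concrete, here is a minimal Python sketch (the helper name `matmul` is ours, not from the article) that multiplies two matrices stored as lists of rows:

```python
def matmul(A, B):
    """Product of an m×n matrix A and an n×p matrix B (lists of rows)."""
    m, n, p = len(A), len(B), len(B[0])
    assert len(A[0]) == n, "columns of A must equal rows of B"
    # c_ij combines the i-th row of A with the j-th column of B
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]        # 2×3
B = [[1, 0],
     [0, 1],
     [1, 1]]           # 3×2
print(matmul(A, B))    # a 2×2 result: [[4, 5], [10, 11]]
```

Note how the dimension rule shows up directly in the code: the inner sum ranges over `n`, which must match between the two operands.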

## Transposition

Given a matrix $M = (m_{ij})$ of dimensions $m \times n$, we call the matrix $M^T = (m_{ji})$ of dimensions $n \times m$, obtained by swapping rows with columns, the transpose of $M$. For example:

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}^T = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}$$

The transpose of a product of two matrices equals the product of the individual transposes in reverse order:

$$(A \cdot B)^T = B^T \cdot A^T$$
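The identity $(A \cdot B)^T = B^T \cdot A^T$ is easy to check numerically; a quick Python sketch (helper names are ours):

```python
def matmul(A, B):
    # Row-by-column product of two matrices stored as lists of rows
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(M):
    # Swap rows with columns
    return [list(col) for col in zip(*M)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 6]]
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
```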

A matrix $M$ of dimensions $n \times n$ is called symmetric if $M = M^T$, that is, if it coincides with its transpose:

$$\begin{bmatrix} 3 & 4 \\ 4 & 5 \end{bmatrix}^T = \begin{bmatrix} 3 & 4 \\ 4 & 5 \end{bmatrix}$$

The elements are symmetric with respect to the main diagonal ($\searrow$).

## Identity matrix

The identity matrix is a square matrix in which all elements of the main diagonal equal $1$, while the others are $0$:

$$I_n = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}$$

The identity matrix is the neutral element with respect to the product operation, that is:

$$A \cdot I_n = I_n \cdot A = A$$

where $A$ is a square matrix of order $n$.

## Inverse matrix

A square matrix $M$ of order $n$ is said to be invertible if there exists a square matrix $M^{-1}$ of order $n$ such that:

$$M^{-1} \cdot M = M \cdot M^{-1} = I_n$$

In that case, $M^{-1}$ is called the inverse of $M$. A non-invertible matrix is also called singular.
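For order $2$, the inverse has a well-known closed form, $M^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$ (standard linear algebra, not derived in this article). A Python sketch that verifies $M^{-1} \cdot M = I_2$:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse2x2(M):
    # Closed-form inverse of a 2×2 matrix; fails when the determinant is zero
    (a, b), (c, d) = M
    det = a * d - b * c
    assert det != 0, "singular matrix: no inverse exists"
    return [[ d / det, -b / det],
            [-c / det,  a / det]]

M = [[3.0, 4.0],
     [2.0, 3.0]]                 # det = 3·3 - 4·2 = 1
print(matmul(inverse2x2(M), M))  # [[1.0, 0.0], [0.0, 1.0]]
```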

## Point representation

In the following sections, we will use matrices to define transformation functions. For these transformations, expressed in matrix form, to act on points, we need to represent the points themselves in a format compatible with matrix operations.

A point is represented by its coordinates, which we can denote using a vector, a row matrix, or a column matrix:

$$(x, y) , \quad \begin{bmatrix} x & y \end{bmatrix} , \quad \begin{bmatrix} x \\ y \end{bmatrix}$$

All three notations denote the same concept and differ only in form, so we will treat them as equivalent.

## Linear transformations

A linear transformation is a function of the form:

$$f(x, y) = (a x + b y, \; c x + d y)$$

where $a, b, c, d$ are real coefficients. The name derives from the fact that the transformed coordinates are a linear combination of the original point's coordinates. A linear transformation can be expressed as the product of a square matrix of order $2$ and a column matrix:

$$f(x, y) = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Alternatively, it can be defined as the product of a row matrix and a square matrix of order $2$:

$$f(x, y) = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & c \\ b & d \end{bmatrix}$$

In the following sections, we will describe points using column matrices, so we will use the first form.
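In code, applying the matrix form to a point is a direct transcription of the definition; a Python sketch (the helper name `apply` is ours):

```python
def apply(M, p):
    """Compute M·p with M a 2×2 matrix (list of rows) and p a point (x, y)."""
    (a, b), (c, d) = M
    x, y = p
    return (a * x + b * y, c * x + d * y)

M = [[2, 1],
     [0, 3]]               # a=2, b=1, c=0, d=3
print(apply(M, (1, 1)))    # (2·1 + 1·1, 0·1 + 3·1) = (3, 3)
```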

Linear transformations preserve parallelism between lines, but not angles and areas: parallel lines remain parallel after the transformation, but angles formed by pairs of lines may change and geometric figures may undergo a variation in area.

*A square transformed by scale, shear, rotation, and translation. Parallelism between the sides is preserved, but not the angles or the area. The dashed figure is the original.*

In other words, if we take a parallelogram, after a linear transformation it will remain a parallelogram, but it might no longer be a rectangle or a square. The directions of lines are modified, and distances are rescaled in a non-uniform way, causing angle distortion and area variation.

## Scale

The scale transformation

$$f_S(x, y) = (x \cdot s_x, \; y \cdot s_y)$$

can be written as:

$$f_S(x, y) = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Performing the matrix product yields:

$$\begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} s_x \cdot x + 0 \cdot y \\ 0 \cdot x + s_y \cdot y \end{bmatrix} = \begin{bmatrix} s_x \cdot x \\ s_y \cdot y \end{bmatrix}$$

## Rotation

The rotation

$$f_R(x, y) = (x \cdot \cos (\theta) - y \cdot \sin (\theta), \; x \cdot \sin (\theta) + y \cdot \cos (\theta))$$

can be written as:

$$f_R(x, y) = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Performing the matrix product yields:

$$\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos(\theta) \cdot x - \sin(\theta) \cdot y \\ \sin(\theta) \cdot x + \cos(\theta) \cdot y \end{bmatrix}$$
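As a sanity check, rotating the point $(1, 0)$ by $90°$ should yield (up to floating-point error) $(0, 1)$; a Python sketch, with helper names that are ours:

```python
import math

def rotation_matrix(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s,  c]]

def apply(M, p):
    # M·p with M a 2×2 matrix (list of rows) and p a point (x, y)
    (a, b), (c, d) = M
    x, y = p
    return (a * x + b * y, c * x + d * y)

x, y = apply(rotation_matrix(math.radians(90)), (1.0, 0.0))
print(round(x, 9), round(y, 9))  # 0.0 1.0
```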

## Shear

The shear transformation

$$f_{SH}(x, y) = (x + sh_x \cdot y, \; y + sh_y \cdot x)$$

can be written as:

$$f_{SH}(x, y) = \begin{bmatrix} 1 & sh_x \\ sh_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Performing the matrix product yields:

$$\begin{bmatrix} 1 & sh_x \\ sh_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \cdot x + sh_x \cdot y \\ sh_y \cdot x + 1 \cdot y \end{bmatrix} = \begin{bmatrix} x + sh_x \cdot y \\ y + sh_y \cdot x \end{bmatrix}$$
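A horizontal shear ($sh_x = 0.5$, $sh_y = 0$) displaces a point proportionally to its $y$ coordinate; a Python sketch with illustrative values of our choosing:

```python
def shear_matrix(shx, shy):
    return [[1, shx],
            [shy, 1]]

def apply(M, p):
    (a, b), (c, d) = M
    x, y = p
    return (a * x + b * y, c * x + d * y)

# A point at height y = 1 is pushed 0.5 to the right; its y is unchanged
print(apply(shear_matrix(0.5, 0.0), (0.0, 1.0)))  # (0.5, 1.0)
```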

## Matrices in WGSL

The WGSL language supports matrices of dimensions between $2 \times 2$ and $4 \times 4$ with the primitive types `matCxRf`, where $C$ is the number of columns and $R$ the number of rows. The notation is reversed compared to mathematical convention, which indicates the number of rows first and then the number of columns. For example, the type `mat2x4f` represents a $4 \times 2$ matrix (four rows and two columns):

$$\begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \\ m_{31} & m_{32} \\ m_{41} & m_{42} \end{bmatrix}$$

The binary encoding of `matCxRf` types is column-major: this means that `matCxRf` is structured in memory as an array of $C$ vectors of length $R$. For example, `mat2x4f` is conceptually equivalent to `array<vec4f, 2>`:

$$[(m_{11}, m_{21}, m_{31}, m_{41}), \, (m_{12}, m_{22}, m_{32}, m_{42})]$$

Care must be taken not to confuse the mathematical notation of matrices, which is row-major, with the binary representation in WGSL, which is column-major. To better understand this concept, let's examine a non-symmetric matrix, such as the shear matrix:

$$shearMatrix(sh_x, sh_y) = \begin{bmatrix} 1 & sh_x \\ sh_y & 1 \end{bmatrix}$$

The `mat2x2f` constructor follows the column-major representation and therefore accepts elements by column, not by row:

```wgsl
fn shearMatrix(shx: f32, shy: f32) -> mat2x2f {
    return mat2x2f(1, shy, shx, 1);
}

fn shearMatrixTransposed(shx: f32, shy: f32) -> mat2x2f {
    return mat2x2f(
        1, shx,
        shy, 1
    );
}
```

The indentation of the `shearMatrixTransposed` function visually mirrors the mathematical notation, which is row-major, but the result in memory will be the transposed matrix:

$$shearMatrixTransposed(sh_x, sh_y) = \begin{bmatrix} 1 & sh_y \\ sh_x & 1 \end{bmatrix}$$

If we had considered a symmetric matrix, there would have been no difference, because, by definition, the transpose of a symmetric matrix coincides with the matrix itself.
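The difference between the two constructors comes down to flattening order. As an illustration (in Python rather than WGSL), flattening the row-major math notation in column-major order reproduces the argument order that `mat2x2f` expects:

```python
# Shear matrix in row-major math notation:
#   [ 1    shx ]
#   [ shy  1   ]
rows = [[1, "shx"],
        ["shy", 1]]

# Column-major flattening: first column top-to-bottom, then second column
column_major = [rows[r][c] for c in range(2) for r in range(2)]
print(column_major)  # [1, 'shy', 'shx', 1], matching mat2x2f(1, shy, shx, 1)
```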

## Product in WGSL

The product in WGSL is defined both between matrices and between matrices and vectors. We can update our example shader by replacing the rotation and scale transformations with matrices:

```wgsl
fn scaleMatrix(sX: f32, sY: f32) -> mat2x2f {
    return mat2x2f(
        sX, 0,
        0, sY
    );
}

fn rotationMatrix(theta: f32) -> mat2x2f {
    let cosT = cos(theta);
    let sinT = sin(theta);
    return mat2x2f(
        cosT, sinT,
        -sinT, cosT
    );
}

@vertex
fn vs(@builtin(vertex_index) index: u32) -> @builtin(position) vec4f {
    // Step 1: Scale
    var transformed = scaleMatrix(0.5, 0.5) * star[index];

    // Step 2: Rotation
    transformed = rotationMatrix(radians(45.0)) * transformed;

    // Step 3: Translation
    let tX = 0.2;
    let tY = -0.1;
    transformed += vec2f(tX, tY);

    return vec4f(transformed, 0.0, 1.0);
}
```

We can already notice a first advantage of using matrices: applying two different transformations, scale and rotation, comes down to the same operation in both cases, a matrix product.

The `scaleMatrix` and `rotationMatrix` functions build the matrices to apply in the form $f(x, y) = M \cdot p$, where $p$ is a column vector. The shader code follows this convention and applies the products in the form `M * p` rather than `p * M`.

In general, in WGSL vectors are considered neither rows nor columns; however, both expressions `M * p` and `p * M` are valid. For the product to always be well-defined, vectors are interpreted as rows or columns depending on the side on which they appear in the product with a matrix. The two expressions are computed as follows:

$$\begin{align*} M \cdot p &= \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix} \\ p \cdot M &= \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} ax + cy & bx + dy \end{bmatrix} \end{align*}$$

The two results do not differ merely in being a row vector rather than a column vector (irrelevant in WGSL), but the elements themselves are different! For this reason, it is essential to establish a convention and always apply it consistently.
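The difference is easy to see numerically; a Python sketch computing both products for the same matrix and vector (helper names are ours):

```python
def mat_vec(M, p):
    # M · p, with p interpreted as a column vector
    return [sum(M[i][k] * p[k] for k in range(2)) for i in range(2)]

def vec_mat(p, M):
    # p · M, with p interpreted as a row vector
    return [sum(p[k] * M[k][j] for k in range(2)) for j in range(2)]

M = [[1, 2],
     [3, 4]]
p = [1, 1]
print(mat_vec(M, p))  # [3, 7]  ->  (a·x + b·y, c·x + d·y)
print(vec_mat(p, M))  # [4, 6]  ->  (a·x + c·y, b·x + d·y)
```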

As in the code example shown, in the following sections we will always use matrices in the form `M * p`, paying attention to the column-major layout in the `matCxRf` constructors.

## Composing transformations

The updated shader version implements the formula:

$$R \cdot (S \cdot p) + t$$

where $R$ is the rotation matrix, $S$ is the scale matrix, $p$ is the vertex to transform, and $t$ is the translation vector. The logical sequence of steps is:

  1. The point $p$ is scaled via matrix product with $S$, obtaining a first intermediate point.
  2. The computed point is rotated via matrix product with $R$, obtaining a second intermediate point.
  3. Finally, the second intermediate point is translated via vector addition with $t$, obtaining the final result.

In particular, the transformations $R$ and $S$ are performed in two separate steps; let's now see how to unify them. In general, the matrix product is associative, that is:

$$A \cdot B \cdot C = A \cdot (B \cdot C) = (A \cdot B) \cdot C$$

Leveraging this property in the shader formula ($p$ is a column matrix), we can write:

$$(R \cdot S) \cdot p + t$$

Setting $W = R \cdot S$, we can simplify to:

$$W \cdot p + t$$

The result is significant: the single matrix $W$ is equivalent to applying $S$ and then $R$ in sequence, that is, it represents their composition. At this point, we can further simplify the shader code:

```wgsl
fn scaleMatrix(sX: f32, sY: f32) -> mat2x2f {
    return mat2x2f(
        sX, 0,
        0, sY
    );
}

fn rotationMatrix(theta: f32) -> mat2x2f {
    let cosT = cos(theta);
    let sinT = sin(theta);
    return mat2x2f(
        cosT, sinT,
        -sinT, cosT
    );
}

@vertex
fn vs(@builtin(vertex_index) index: u32) -> @builtin(position) vec4f {
    let worldTransform = rotationMatrix(radians(45.0)) * scaleMatrix(0.5, 0.5);

    // Step 1: Scale + Rotation
    var transformed = worldTransform * star[index];

    // Step 2: Translation
    let tX = 0.2;
    let tY = -0.1;
    transformed += vec2f(tX, tY);

    return vec4f(transformed, 0.0, 1.0);
}
```

The advantage of this approach may seem limited in our example, since the overall number of operations does not change. However, in real cases, `worldTransform` is usually a shader parameter and, consequently, is not recomputed for each vertex.

Generalizing, suppose we want to apply, in order, a series of linear transformations $M_1, M_2, \ldots, M_n$ to a point $p$:

$$M_n \cdot \ldots \cdot (M_3 \cdot (M_2 \cdot (M_1 \cdot p)))$$

Repeatedly leveraging the associative property, we can represent the composition through a single matrix $W = M_n \cdot M_{n-1} \cdot \ldots \cdot M_1$ and use it directly:

$$W \cdot p$$

By doing so, the single matrix $W$ encodes the original sequence, abstracting the vertex shader from the type, number, and order of transformations to apply. 😮😱
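We can verify numerically that the pre-multiplied matrix $W$ gives the same result as applying the transformations one by one; a Python sketch mirroring the shader's scale-then-rotate sequence (helper names are ours):

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

def mat_vec(M, p):
    return [sum(M[i][k] * p[k] for k in range(2)) for i in range(2)]

S = [[0.5, 0.0],
     [0.0, 0.5]]                               # scale
theta = math.radians(45)
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]      # rotation

p = [0.3, 0.7]
step_by_step = mat_vec(R, mat_vec(S, p))       # R · (S · p)
W = matmul(R, S)                               # compose once...
combined = mat_vec(W, p)                       # ...then apply: (R · S) · p
print(all(abs(a - b) < 1e-12 for a, b in zip(step_by_step, combined)))  # True
```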

Unlike multiplication of real numbers, multiplication of square matrices is generally not commutative, that is, $A \cdot B \neq B \cdot A$. The order in which transformations are performed matters, as we saw intuitively in the previous article. It is important to note that the order in which the matrices appear in the definition of $W$ is the reverse of the order in which the transformations are "applied": the product $M_n \cdot M_{n-1} \cdot \ldots \cdot M_1$ is equivalent to transforming first with $M_1$, then with $M_2$, and so on.
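A quick Python check of non-commutativity, using a non-uniform scale and a shear (the specific matrices are ours, chosen for illustration):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

S = [[2, 0],
     [0, 1]]            # non-uniform scale along x
SH = [[1, 1],
      [0, 1]]           # horizontal shear

print(matmul(S, SH))    # [[2, 2], [0, 1]]  (shear first, then scale)
print(matmul(SH, S))    # [[2, 1], [0, 1]]  (scale first, then shear)
```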

## Conclusions

In this article, we introduced matrices and some of their properties. We saw that scale, rotation, and shear are linear transformations and can therefore be expressed in the form $M \cdot p$, where $M$ is a square matrix of order $2$.

So far, however, we have intentionally left translation aside. As we will see, it is not possible to represent a translation in the plane with square matrices of order $2$!

How to solve this problem? The answer in the next article, dedicated to projective spaces and homogeneous coordinates.