Change of Coordinates and Applications to View Matrices
This article introduces the mathematical derivation of the lookAt view matrix commonly used in computer graphics APIs like OpenGL.
As part of the graphics pipeline, this matrix performs the view transformation: it converts global world coordinates into a coordinate system defined by the position and viewing direction of an observer.
The derivation is built from fundamental linear algebra principles, beginning with the interpretation of the matrix-vector product as projections onto an orthonormal basis. This leads to a formal treatment of coordinate system changes using change-of-coordinates matrices. By applying these concepts, we construct the final 4×4 matrix and reveal the precise geometric meaning of both its rotational and translational components.
This article derives the lookAt matrix used in OpenGL, which is a 4×4 transformation matrix of the form
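$$M_{\mathrm{lookAt}} = \begin{pmatrix} \underset{\mathrm{cam} \leftarrow \mathrm{world}}{P} & T_{3\times 1} \\ 0 & 1 \end{pmatrix}$$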
Here, $\underset{\mathrm{cam} \leftarrow \mathrm{world}}{P}$ denotes a change-of-coordinates matrix. Although $T_{3\times 1}$ is often referred to as a translation vector (that is, a displacement vector), we will later see that in the view matrix this displacement also involves a rotation.
The purpose of the lookAt matrix is to transform coordinates from world space to camera space, effectively re-expressing world coordinates relative to the camera's point of view:
$$v_{\mathrm{cam}} = M_{\mathrm{lookAt}} \, v_{\mathrm{world}}$$
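As a minimal sketch of how a block matrix of this shape acts on a point, the following Python snippet assembles a 4×4 matrix from a 3×3 block and a 3×1 column and applies it to a homogeneous world coordinate. The helper names `make_block_matrix` and `mat_vec`, as well as the numeric values of `P` and `T`, are placeholders chosen for illustration; they are not the lookAt entries derived later.

```python
def make_block_matrix(P, T):
    """Assemble the 4x4 matrix [[P, T], [0, 1]] from a 3x3 block P and a 3-vector T."""
    rows = [list(P[i]) + [T[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def mat_vec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Placeholder values (assumptions for this sketch):
# P is a 90-degree rotation about the y-axis, T an arbitrary displacement.
P = [[0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0],
     [-1.0, 0.0, 0.0]]
T = [1.0, 2.0, 3.0]

M = make_block_matrix(P, T)

# A point in world space, written in homogeneous coordinates (last entry 1).
v_world = [4.0, 5.0, 6.0, 1.0]
v_cam = mat_vec(M, v_world)

# The first three components equal P * v + T; the last component stays 1.
print(v_cam)  # [7.0, 7.0, -1.0, 1.0]
```

The point of the sketch is only the block structure: the upper-left block acts on the spatial part of the vector, the right column is added afterwards, and the bottom row keeps the homogeneous component at 1.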
We begin by examining the structure of matrix-vector products, where the matrix rows are interpreted as the vectors of an orthonormal basis¹ in $\mathbb{R}^3$.
Next, we introduce the concept of coordinate vectors and explain how change-of-coordinates matrices are used to map vectors from one coordinate system B into another coordinate system C.
Building on these concepts, we conclude by deriving the explicit form of the lookAt matrix to map coordinates between $V_{\mathrm{world}}$ and $V_{\mathrm{cam}}$.
To complement the theoretical derivation, we provide an interactive application as a hands-on experience, enabling readers to visualize the effects of these transformations in real-time.
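Let $u, v, w$ be the orthonormal vectors of the standard basis $\varepsilon$ in $\mathbb{R}^3$:
$$u = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad v = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad w = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$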
Let $p = \begin{pmatrix} p_x \\ p_y \\ p_z \end{pmatrix}$ be an arbitrary vector in $\mathbb{R}^3$.
We can write $p$ as the matrix-vector product $Ax = p$, where $A$ is a 3×3 matrix whose rows represent the vectors $u, v, w$. Since $A = A^T = I_3$, it immediately follows that $x = p$.
The resulting components $p_x, p_y, p_z$ represent the scalar projections of $p$ onto the axes of the standard basis.
Since $u$ is a normalized vector, multiplying $p_x$ by $u$ yields the orthogonal projection of $p$ onto $u$, that is, the component of $p$ parallel to $u$.
These projections can then be used to reconstruct $p$ as a linear combination of the basis vectors $u, v, w$, e.g. for $u$:
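Written out with the definitions above, the step for $u$ and the resulting reconstruction of $p$ can be sketched as follows. The operator name $\operatorname{proj}_u$ is notation assumed here for the orthogonal projection onto $u$; it does not appear in the original text.

```latex
\operatorname{proj}_{u}(p) = (p \cdot u)\,u = p_x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},
\qquad
p = (p \cdot u)\,u + (p \cdot v)\,v + (p \cdot w)\,w = p_x\,u + p_y\,v + p_z\,w
```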
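Before we generalize this construction to an arbitrary orthonormal basis, it is worth noting that the scalar projections (that is, the weights in the above linear combination) are commonly referred to as the components of the corresponding coordinate vector. For example, the coordinate vector of $p$ relative to the standard basis $\varepsilon$ is denoted by²
$$[p]_{\varepsilon} = p_x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + p_y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + p_z \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} p_x \\ p_y \\ p_z \end{pmatrix}$$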
Coordinate-Vector Definition
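Let $B = \{b_1, b_2, \ldots, b_n\}$ be a basis for the vector space $V = \mathbb{R}^n$.
Then, every vector $x \in V$ can be written uniquely as a linear combination:
$$x = c_1 b_1 + c_2 b_2 + \ldots + c_n b_n$$
The vector
$$[x]_B = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}$$
is then called the coordinate vector [LLM21, 256] of $x$ relative to $B$.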
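To make the definition concrete, here is a small Python sketch that computes the weights $c_1, \ldots, c_n$ for a basis that is not orthonormal, by solving the linear system whose columns are the basis vectors. The function names `solve` and `coordinate_vector` and the example basis are assumptions made for this illustration.

```python
def solve(A, b):
    """Solve A c = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs stay untouched.
    aug = [[float(a) for a in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot to keep the elimination stable.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("the given vectors do not form a basis")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Normalize the pivot row, then clear this column in all other rows.
        pivot_value = aug[col][col]
        aug[col] = [a / pivot_value for a in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def coordinate_vector(basis, x):
    """Return [x]_B, the weights c_i with x = c_1 b_1 + ... + c_n b_n."""
    n = len(x)
    # The basis vectors form the columns of the system matrix.
    A = [[basis[j][i] for j in range(n)] for i in range(n)]
    return solve(A, x)


# Example basis of R^3 (deliberately not orthonormal) and a vector x.
B = [[1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0],
     [1.0, 1.0, 1.0]]
x = [6.0, 5.0, 3.0]

c = coordinate_vector(B, x)
print(c)  # weights such that x = c[0]*b1 + c[1]*b2 + c[2]*b3
```

For this example the weights come out as 1, 2 and 3, since $1 \cdot b_1 + 2 \cdot b_2 + 3 \cdot b_3 = (6, 5, 3)$. For an orthonormal basis no system needs to be solved, because the weights reduce to dot products with the basis vectors.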
Thus, the vector $p' = Ap$ contains exactly the scalar projections of $p$ onto the axes of the orthonormal basis, and the original vector can be reconstructed as a linear combination of the basis vectors scaled by those projections. □
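The statement can be checked numerically with a short Python sketch. It assumes an example orthonormal basis (the standard basis rotated about the z-axis, built from a 3-4-5 triangle) whose vectors form the rows of `A`; the names `dot`, `p_prime` and `p_rebuilt` are chosen for this illustration.

```python
def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


# Example orthonormal basis (an assumption for this sketch).
u = [0.6, 0.8, 0.0]
v = [-0.8, 0.6, 0.0]
w = [0.0, 0.0, 1.0]
A = [u, v, w]  # the basis vectors are the rows of A

p = [5.0, 10.0, 2.0]

# p' = A p: each component is the scalar projection of p onto one basis vector.
p_prime = [dot(row, p) for row in A]

# Reconstruction: p = p'_1 * u + p'_2 * v + p'_3 * w.
p_rebuilt = [sum(p_prime[k] * A[k][i] for k in range(3)) for i in range(3)]

print(p_prime)    # approximately [11, 2, 2]
print(p_rebuilt)  # approximately [5, 10, 2], up to floating-point rounding
```

The reconstruction step is the product $A^T p'$, which recovers $p$ because the rows of $A$ are orthonormal.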
Multiplying the coordinate vectors contained in the change-of-coordinates matrix by the coordinates of $v$ relative to $B$ then yields the coordinates of $v$ relative to $C$, following the rules of linear transformations.
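A minimal Python sketch of this mapping follows. It assumes that the columns of the change-of-coordinates matrix are the coordinate vectors of the basis vectors of $B$ relative to $C$, and, to keep the code short, that $C$ is orthonormal so those coordinates can be computed as dot products. The function name `change_of_coordinates` and both example bases are hypothetical.

```python
def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def mat_vec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [dot(row, v) for row in M]


def change_of_coordinates(B, C):
    """Matrix whose j-th column is [b_j]_C.

    Assumes C is orthonormal, so entry (i, j) is simply b_j . c_i.
    """
    n = len(B)
    return [[dot(B[j], C[i]) for j in range(n)] for i in range(n)]


# Example bases (assumptions for this sketch): B is arbitrary, C is orthonormal.
B = [[1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0],
     [1.0, 1.0, 1.0]]
C = [[0.6, 0.8, 0.0],
     [-0.8, 0.6, 0.0],
     [0.0, 0.0, 1.0]]

# Coordinates of v relative to B: v = 1*b1 + 2*b2 + 3*b3.
v_B = [1.0, 2.0, 3.0]

P = change_of_coordinates(B, C)
v_C = mat_vec(P, v_B)

# Cross-check: build v in standard coordinates and project it onto C directly.
v_std = [sum(v_B[j] * B[j][i] for j in range(3)) for i in range(3)]
v_C_direct = [dot(v_std, c) for c in C]

print(v_C)         # approximately [7.6, -1.8, 3.0]
print(v_C_direct)  # same values, up to floating-point rounding
```

Both routes agree because the matrix-vector product forms the linear combination of the columns $[b_j]_C$ weighted by the entries of $[v]_B$.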