Perspective Projection

Derivation of Perspective Projection Matrix

Posted by Xingyu Wang on June 18, 2021

Perspective projection projects objects onto a plane. Its effect is that distant objects appear smaller.

projection

If we look at this scene along the negative x-axis direction, we get the following side view:

projection

The perspective matrix is used to project all vertices onto the near clipping plane. For example, the picture above has a vertex (x, y, z). After the perspective matrix is applied, we get the projected vertex (x', y', z'). The coordinates of the projected vertex can be calculated using similar triangles.

y' is inversely proportional to the depth of the vertex. Note the sign: near is always positive, while z is negative (the camera looks in the negative z direction).

The same idea applies to x'.
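Written out, the similar-triangle relations look like the following sketch. It assumes n denotes the positive near distance, so the near plane sits at z = -n:

```latex
\frac{y'}{n} = \frac{y}{-z}
\quad\Longrightarrow\quad
y' = \frac{n\,y}{-z},
\qquad
x' = \frac{n\,x}{-z},
\qquad
z' = -n
```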

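However, it's impossible to construct a 3 by 3 matrix that transforms the x component of a vector (x, y, z) into near multiplied by x divided by z, because a matrix can only produce linear combinations of x, y and z, and a division by z is not linear.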
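A quick way to see the non-linearity is to project a vertex and a scaled copy of it. The sketch below is only an illustration: the helper name `project` and the default near distance of 1.0 are assumptions, not part of the original post.

```python
def project(v, near=1.0):
    """Project (x, y, z) onto the near plane z = -near using similar triangles."""
    x, y, z = v
    # z is negative in front of the camera, so -z is the positive depth.
    return (near * x / -z, near * y / -z, -near)


v = (1.0, 2.0, -4.0)
doubled = tuple(2 * c for c in v)

p = project(v)
p_doubled = project(doubled)

# A linear map M would satisfy M(2v) == 2 * M(v).
# Perspective projection instead sends v and 2v to the same point.
print(p)
print(p_doubled)
print(p == p_doubled)
```

Both vertices lie on the same ray through the camera, so they land on the same projected point, which no 3 by 3 matrix acting on (x, y, z) can reproduce.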


Homogeneous Coordinates

In homogeneous coordinates, a fourth component w is appended to each point, and multiplying a point by any nonzero scalar gives another representation of the same point.

If we assume the viewing plane is at w = 1, then any point in homogeneous coordinates can be divided by its w component to get the projected point on the viewing plane. In other words, if the perspective matrix outputs a vector whose w component equals -z, the projected position can be recovered by dividing by w after the perspective matrix is applied.
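The idea that scalar multiples represent the same point can be checked with a few lines of code. This is a minimal sketch; the function name `homogeneous_divide` and the sample values are made up for illustration.

```python
def homogeneous_divide(p):
    """Divide a homogeneous point (x, y, z, w) by w to get (x, y, z)."""
    x, y, z, w = p
    if w == 0:
        raise ValueError("w == 0 does not correspond to a finite point")
    return (x / w, y / w, z / w)


point = (2.0, 4.0, -8.0, 1.0)
scaled = tuple(0.5 * c for c in point)  # same point, different representation

print(homogeneous_divide(point))
print(homogeneous_divide(scaled))
```

Both calls return the same 3D point, because the scale factor cancels out when dividing by w.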

After applying the perspective matrix:

After dividing by the w component:

Therefore, the perspective matrix:

Homogeneous Division:
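The steps above can be sketched as one chain. This assumes n is the positive near distance and that the matrix simply scales x, y and z by n while copying -z into w; the exact entries are an assumption chosen to be consistent with the similar-triangle result.

```latex
\begin{bmatrix}
n & 0 & 0 & 0 \\
0 & n & 0 & 0 \\
0 & 0 & n & 0 \\
0 & 0 & -1 & 0
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} n x \\ n y \\ n z \\ -z \end{bmatrix}
\;\xrightarrow{\ \div\, w\ }\;
\begin{bmatrix} \dfrac{n x}{-z} \\[6pt] \dfrac{n y}{-z} \\[6pt] -n \\ 1 \end{bmatrix}
```

Under this assumption every vertex ends up with z' = -n after the division, whatever its original depth was.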


Perspective Matrix

The problem with this matrix is that depth information is lost during the homogeneous division (since w = -z).
To keep the depth information, the z component should be -z^2 before the homogeneous division, so that dividing by w = -z gives z back.

That gives us:

The equation has two unknowns (m1 and m2), so we need two different z values to solve it.

frustum

The viewing region (also called the frustum) has two depth limits: near and far. We use n and f to represent these two values.
The near and far planes give us the two z values needed to solve the equation above.
The solutions for m1 and m2 are:
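One way to write out the system and its solution is sketched below. It assumes n and f are positive distances, so the near and far planes sit at z = -n and z = -f; with a different sign convention the signs of the results change.

```latex
m_1 z + m_2 = -z^2
\quad\Longrightarrow\quad
\begin{cases}
-m_1 n + m_2 = -n^2 \\
-m_1 f + m_2 = -f^2
\end{cases}
\quad\Longrightarrow\quad
m_1 = n + f, \qquad m_2 = n f
```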

The resulting perspective matrix:
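As a sanity check, the matrix can be built and applied in a few lines. This is a sketch under stated assumptions: n and f are positive distances, the third row holds m1 = n + f and m2 = n * f, and the helper names (`perspective_matrix`, `mat_vec`, `project`) are invented for this example.

```python
def perspective_matrix(n, f):
    """4x4 perspective matrix for a camera looking down -z (assumed convention)."""
    m1 = n + f
    m2 = n * f
    return [
        [n, 0.0, 0.0, 0.0],
        [0.0, n, 0.0, 0.0],
        [0.0, 0.0, m1, m2],
        [0.0, 0.0, -1.0, 0.0],  # copies -z into w
    ]


def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def project(m, point):
    """Apply the matrix to (x, y, z), then perform the homogeneous division."""
    x, y, z, w = mat_vec(m, [point[0], point[1], point[2], 1.0])
    return (x / w, y / w, z / w)


P = perspective_matrix(1.0, 10.0)
print(project(P, (2.0, 1.0, -1.0)))   # vertex on the near plane
print(project(P, (2.0, 1.0, -10.0)))  # vertex on the far plane
print(project(P, (2.0, 1.0, -5.0)))   # vertex in between
```

Vertices on the near and far planes keep their z value, while a vertex in between gets a different z that still lies between -n and -f, so the depth ordering survives even though intermediate depths are not preserved exactly.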


Perspective Projection Matrix

After applying the perspective transformation and homogeneous division, the frustum becomes an axis-aligned bounding box (AABB).

frustum_to_AABB

After that, we can apply an orthographic projection matrix, which transforms the viewing volume (inside the AABB) into the canonical view volume.

canonical_view

It's worth mentioning that homogeneous division is just a multiplication of the vector by a scalar (1/w).

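Because a scalar factor commutes with matrix multiplication, the homogeneous division can be factored out (scalar factorization): the perspective matrix and the orthographic projection matrix can be applied first, and the division performed last.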
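The factorization can be verified numerically by comparing both orders of operations. The sketch below assumes an OpenGL-style orthographic matrix that maps the box [l, r] x [b, t] x [-f, -n] to the cube [-1, 1]^3; the function names and sample values are illustrative only.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def perspective(n, f):
    """Perspective matrix with m1 = n + f and m2 = n * f (assumed convention)."""
    return [
        [n, 0.0, 0.0, 0.0],
        [0.0, n, 0.0, 0.0],
        [0.0, 0.0, n + f, n * f],
        [0.0, 0.0, -1.0, 0.0],
    ]


def orthographic(l, r, b, t, n, f):
    """OpenGL-style orthographic matrix (an assumption, not from the post)."""
    return [
        [2.0 / (r - l), 0.0, 0.0, -(r + l) / (r - l)],
        [0.0, 2.0 / (t - b), 0.0, -(t + b) / (t - b)],
        [0.0, 0.0, -2.0 / (f - n), -(f + n) / (f - n)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def divide_then_ortho(P, O, point):
    """Perspective, homogeneous division, then orthographic projection."""
    clip = mat_vec(P, list(point) + [1.0])
    divided = [c / clip[3] for c in clip]  # w becomes 1
    return mat_vec(O, divided)[:3]


def ortho_then_divide(P, O, point):
    """Perspective, orthographic projection, then a single division at the end."""
    clip = mat_vec(O, mat_vec(P, list(point) + [1.0]))
    return [c / clip[3] for c in clip[:3]]


P = perspective(1.0, 10.0)
O = orthographic(-1.0, 1.0, -1.0, 1.0, 1.0, 10.0)
vertex = (0.5, -0.25, -4.0)

a = divide_then_ortho(P, O, vertex)
b = ortho_then_divide(P, O, vertex)
print(a)
print(b)
```

Both orders agree up to floating-point rounding, which is why the two matrices can be multiplied together once and the division left to the end.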

Perspective Projection Matrix:
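Multiplying the two matrices gives the combined result. The sketch below assumes l, r, b and t are the left, right, bottom and top bounds of the view volume on the near plane, n and f are positive distances, and the orthographic matrix follows the OpenGL convention of mapping to the cube [-1, 1]^3:

```latex
\begin{bmatrix}
\frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \\
0 & \frac{2}{t-b} & 0 & -\frac{t+b}{t-b} \\
0 & 0 & -\frac{2}{f-n} & -\frac{f+n}{f-n} \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
n & 0 & 0 & 0 \\
0 & n & 0 & 0 \\
0 & 0 & n+f & nf \\
0 & 0 & -1 & 0
\end{bmatrix}
=
\begin{bmatrix}
\frac{2n}{r-l} & 0 & \frac{r+l}{r-l} & 0 \\
0 & \frac{2n}{t-b} & \frac{t+b}{t-b} & 0 \\
0 & 0 & -\frac{f+n}{f-n} & -\frac{2fn}{f-n} \\
0 & 0 & -1 & 0
\end{bmatrix}
```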


Credits:
Learn WebGL
OpenGL Projection Matrix by songho
video by Brendan Galea, really good explanation