Perspective perspective…when doing 3D stuff you always hear about perspective.
What is perspective? Well…in the real world, we all see in perspective. It’s just the normal way of seeing things, when you look around.
Perspective is that thing where objects that are far away appear to be smaller than those that are close to you.
Perspective is that thing where if you’re sitting in the middle of a straight road (watch out for cars), you actually see the borders of the road as two converging lines.
That’s perspective. We want perspective badly in 3D. Otherwise the world won’t look real. It won’t look 3D.
There’s an additional layer of complexity here. What we’re doing with 3D is to take a 3D world, with objects that are defined in 3D with vertices, triangles, etc., and view that scene using a 2D device, such as the monitor.
What happens is that we want to “project” the 3D world onto the 2D surface of the monitor.
That’s what we call “Perspective Projection”. Which simply means to project the 3D scene on the 2D screen, following the laws of perspective.
Isn’t That Hard to Code?
Not really. But first, let’s look at what we intuitively want to achieve. Imagine that your monitor is not really a monitor. Let’s say it’s a window with a glass, looking out into some real 3D scene, that sits beyond the monitor.
You’re on this side of the monitor, observing, and keeping your head fixed relative to the monitor.
Let’s say you get a butterfly sitting on the inside, in the center of the monitor. You’ll see it pretty large. And in the center. Then the butterfly flies away, following a straight trajectory, perpendicular to the monitor. You’ll see it becoming smaller and smaller. And it will remain in the center of the screen as it flies.
Now let’s say that your butterfly is back on the monitor glass, but this time it’s sitting on a side of the monitor. It flies away again, perpendicularly to the glass surface. This time too you’ll see it become smaller and smaller, but you’ll also see it converge towards the center of the monitor, as it gets farther away.
That’s perspective. Far objects get smaller, and they move towards the center of the projection as they get farther away.
Cool! Now Let’s Do That With Math
Sure. Let’s say xP and yP are the coordinates of the “projected” position of our butterfly on the screen.
And let’s say that the “world” position in 3D of the butterfly is xW, yW, zW. That’s where the butterfly is actually positioned, in the 3D scene beyond the screen.
Let’s use a 3D coordinate system for world coordinates, where the x axis points to the right, y points up, and positive z points inside the screen. The origin is where our eye sits. So, the glass of our screen will be on a plane orthogonal to the z axis, at some z that I’ll call zNear.
We want to get the projected positions xP and yP, by dividing the world positions xW, and yW, by zW.
xP = K1 * xW / zW
yP = K2 * yW / zW
K1 and K2 are constants that are derived from geometrical factors such as the aspect ratio of your “glassed window” (your screen) and the “field of view” of your eye, which takes into account how wide-angle your eye is.
You can see how, with this transform, points whose projections (xP, yP) fall near the sides of the screen get pushed towards the center as the distance from the eye (zW) increases. Points whose projections are already close to the center of the screen (0, 0) are much less affected by the distance from the eye, and tend to remain around the screen center.
So, that’s what we want.
This division by z is the famous “perspective divide”.
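Here’s a minimal ActionScript sketch of the perspective divide, just to watch the numbers move. The class name, the project helper and the sample values (a butterfly sitting at xW = 0.8, with K1 = 1) are made up for illustration; they are not part of the sample code.

```actionscript
package {
    import flash.display.Sprite;

    // Hypothetical demo class: follows a point as it flies away from the eye.
    public class PerspectiveDivideDemo extends Sprite {
        public function PerspectiveDivideDemo() {
            var xW:Number = 0.8; // the butterfly sits off to the right (assumed value)
            var k1:Number = 1.0; // assumed constant, see K1 in the text

            // Double the distance from the eye at every step
            for (var zW:Number = 1; zW <= 16; zW *= 2) {
                trace("zW =", zW, "-> xP =", project(xW, zW, k1));
            }
        }

        // xP = K * xW / zW (the same formula works for y, with K2)
        private function project(coordW:Number, zW:Number, k:Number):Number {
            return k * coordW / zW;
        }
    }
}
```

Every time zW doubles, xP gets halved, so the butterfly drifts towards the center of the screen (xP = 0) without ever quite reaching it.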
How Do You Do That With Matrices?
At first sight, doing perspective projections with matrices is tricky. This is because a matrix transform is a “linear transform”: the transformed vector components are simply linear combinations of the input vector components. Matrix transforms only let you do translations, rotations, scaling, and skewing. They don’t allow operations like the perspective divide, where a component gets divided by another component.
Now, remember that we usually represent 3 dimensional coordinates with 4 dimensional vectors of the form (x, y, z, w), where w is usually 1. The solution to the matrix based perspective divide issue, is to use the 4th coordinate w in a creative way, by storing the zW coordinate, into the w coordinate of the transformed vector.
The x and y components of the transformed vector then hold xP and yP pre-multiplied by zW.
Therefore we want the following transform:
xW -> xP' = xP * zW = K1 * xW
yW -> yP' = yP * zW = K2 * yW
zW -> zP' = K3 * (zW - zNear)
That’s definitely possible to do with a matrix transform, as each transformed component is a linear combination of the components of the world vector (xW, yW, zW, 1). The constant term in zP' comes from the input w, which is 1, and the transformed w simply receives zW.
After that’s done, the actual projected xP, yP and zP values are obtained by dividing the transformed x, y and z components by w.
This gets us:
xP = K1 * xW / zW
yP = K2 * yW / zW
zP = K3 * (zW - zNear) / zW
which is exactly what we want.
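To make the trick with w concrete, here’s a small sketch that writes the transform as a 4x4 matrix, applies it by hand, and then does the divide. It deliberately avoids Matrix3D so that every multiplication is visible. The class name, the transform helper and the values K1 = K2 = K3 = 1 and zNear = 1 are assumptions picked to keep the numbers readable.

```actionscript
package {
    import flash.display.Sprite;

    // Hypothetical demo class: perspective as "4x4 matrix, then divide by w".
    public class MatrixDivideDemo extends Sprite {
        public function MatrixDivideDemo() {
            // Illustrative constants, not taken from the sample code
            var k1:Number = 1.0;
            var k2:Number = 1.0;
            var k3:Number = 1.0;
            var zNear:Number = 1.0;

            // Rows of the matrix, acting on the column vector (xW, yW, zW, 1)
            var rows:Array = [
                [k1, 0,  0,  0],
                [0,  k2, 0,  0],
                [0,  0,  k3, -k3 * zNear], // zP' = K3 * (zW - zNear)
                [0,  0,  1,  0]            // w'  = zW
            ];

            var world:Array = [0.8, 0.4, 4.0, 1.0];
            var clip:Array = transform(rows, world); // (0.8, 0.4, 3, 4)

            // The perspective divide: x, y and z all get divided by w
            var w:Number = clip[3];
            trace("projected:", clip[0] / w, clip[1] / w, clip[2] / w);
        }

        // Plain row-by-column product of a 4x4 matrix with a 4-component vector
        private function transform(rows:Array, v:Array):Array {
            var out:Array = [];
            for (var r:int = 0; r < 4; r++) {
                var sum:Number = 0;
                for (var c:int = 0; c < 4; c++) {
                    sum += rows[r][c] * v[c];
                }
                out.push(sum);
            }
            return out;
        }
    }
}
```

With these values the point (0.8, 0.4, 4) lands at xP = 0.2, yP = 0.1, zP = 0.75: the matrix itself never divides anything, and the division by w = zW is what adds the perspective.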
Clip Space and Normalized Device Coordinates
Now, Molehill expects you to use a matrix in your Vertex Shader that transforms vertices to a special space where:
(x, y, z, w) = (xP', yP', zP', zW)
with xP’, yP’, zP’ and zW defined as above, and where constants K1, K2 and K3 are chosen so that xP and yP of all visible points in the 3D world are in the range (-1, 1), and zP falls in the range (0, 1). This means that an object falling at the right edge of the screen, once projected, will have xP = 1, and one at the left edge will have xP = -1.
This 4-dimensional space (xP’, yP’, zP’, zW) is called “clip space”, as it’s the one where clipping usually takes place. The (xP, yP, zP) coordinates, after the divide, with range (-1,1) (for xP and yP), and range (0, 1) (for zP) are called Normalized Device Coordinates (NDC).
Molehill, and the GPU, will take the output of your vertex shader in clip space form, and carry out the perspective divide internally.
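If you’re curious about where K1, K2 and K3 can come from, here’s a sketch of one common choice. It assumes a vertical field of view fovY in radians, an aspect ratio defined as width divided by height, and the left-handed setup used in this post. The formulas, the class name and the sample values (45 degrees, 4/3, 0.1, 1000) are assumptions of this sketch, not something derived above.

```actionscript
package {
    import flash.display.Sprite;

    // Hypothetical demo class: one common way to choose K1, K2 and K3.
    public class ProjectionConstantsDemo extends Sprite {
        public function ProjectionConstantsDemo() {
            // Illustrative values
            var fovY:Number = 45 * Math.PI / 180; // vertical field of view, in radians
            var aspect:Number = 4 / 3;            // viewport width / height
            var zNear:Number = 0.1;
            var zFar:Number = 1000;

            var k2:Number = 1 / Math.tan(fovY / 2); // top edge of the view maps to yP = 1
            var k1:Number = k2 / aspect;            // right edge of the view maps to xP = 1
            var k3:Number = zFar / (zFar - zNear);  // zNear maps to zP = 0, zFar to zP = 1

            // A point on the top edge of the view, 10 units away from the eye
            var zW:Number = 10;
            var yW:Number = zW * Math.tan(fovY / 2);
            trace("yP at the top edge:", k2 * yW / zW); // 1, up to rounding

            trace("zP at zNear:", depth(zNear, k3, zNear)); // 0
            trace("zP at zFar:", depth(zFar, k3, zNear));   // 1, up to rounding
        }

        // zP = K3 * (zW - zNear) / zW
        private function depth(zW:Number, k3:Number, zNear:Number):Number {
            return k3 * (zW - zNear) / zW;
        }
    }
}
```

A wider field of view gives smaller K1 and K2, so the same object takes up less of the screen.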
A Note on Clipping
Objects that sit on the “screen glass”, at zW = zNear, get a zP equal to 0, while those that are at some distance that we define as zW = zFar get transformed in NDC space to zP = 1.
zNear and zFar define the so-called “clipping planes”. Objects falling closer than zNear will be clipped (not drawn), as well as objects falling farther away than zFar. Also, objects with xP and yP outside the range (-1, 1) will get clipped.
For simplicity I’m dealing with point objects here. An actual extended object can get partially clipped, as parts of it fall into view, while other parts fall outside it.
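For point objects, the clipping test can be sketched directly on clip space coordinates, before the divide. Since w holds zW, which is positive for anything in front of the eye, asking for xP between -1 and 1 is the same as asking for x between -w and w. The class and function names here are hypothetical, and the GPU does all of this for you; the sketch is only meant to show the logic.

```actionscript
package {
    import flash.display.Sprite;

    // Hypothetical demo class: tests clip space points against the view volume.
    public class ClipTestDemo extends Sprite {
        public function ClipTestDemo() {
            trace(isInsideView(0.8, 0.4, 3.0, 4.0));  // true: xP = 0.2, yP = 0.1, zP = 0.75
            trace(isInsideView(5.0, 0.0, 3.0, 4.0));  // false: xP = 1.25, off the right edge
            trace(isInsideView(0.0, 0.0, -0.5, 0.5)); // false: zP < 0, closer than zNear
        }

        // Same ranges as NDC, multiplied through by w so no division is needed
        private function isInsideView(x:Number, y:Number, z:Number, w:Number):Boolean {
            return w > 0
                && x >= -w && x <= w   // -1 <= xP <= 1
                && y >= -w && y <= w   // -1 <= yP <= 1
                && z >= 0  && z <= w;  //  0 <= zP <= 1
        }
    }
}
```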
How to Do That in Molehill?
Luckily, there’s a simple extension of the Matrix3D class that Adobe has made that helps with this.
It’s “close to official” at this stage, and you can download it here.
The PerspectiveMatrix3D class in that package implements a few simple functions that create the perspective matrix transform that we need:
- PerspectiveMatrix3D::perspectiveFieldOfViewLH
- PerspectiveMatrix3D::perspectiveFieldOfViewRH
Here I’ve been using a world coordinate system where the x points to the right, y is up, and positive z enters the screen. A left handed coordinate system. Therefore I want to use the LH flavor of the matrix function.
Just create a perspective projection matrix:
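import com.adobe.utils.PerspectiveMatrix3D;
var projectionTransform:PerspectiveMatrix3D = new PerspectiveMatrix3D();
// fov: vertical field of view, in radians; aspect: viewport width / height
projectionTransform.perspectiveFieldOfViewLH(fov, aspect, zNear, zFar);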
Let me use this matrix, for a simple update to the sample code I had provided in my AGAL Tutorial post.
I’ll simply append (pre-multiply) the projection matrix to the chain of matrix transforms that we were using to position and to rotate the object, so that it gets applied last. I’ll also give a different spin to the rotations, so that the perspective is visible.
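var m:Matrix3D = new Matrix3D();
// Spin the object around two axes, so that the perspective is visible
m.appendRotation(getTimer()/30, Vector3D.Y_AXIS);
m.appendRotation(getTimer()/10, Vector3D.X_AXIS);
// Push the object 2 units into the screen (positive z), in front of the eye
m.appendTranslation(0, 0, 2);
// Apply the projection last: world space to clip space
m.append(projectionTransform);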
You can see the application in action here.

And here is the source code for you to download.
Sample Source Code
Still Hungry?
I hope this helped you get into the amazing world of 3D perspective. It’s really a cornerstone of 3D rendering…something you can’t do without.